89 research outputs found

    Probabilistic Safety Regions Via Finite Families of Scalable Classifiers

    Supervised classification recognizes patterns in the data to separate classes of behaviours. Canonical solutions contain misclassification errors that are intrinsic to the approximate numerical nature of machine learning. The data analyst may minimize the classification error on one class at the expense of increasing the error on the other classes. Error control in such a design phase is often done heuristically. In this context, it is key to develop theoretical foundations capable of providing probabilistic certifications for the obtained classifiers. To this end, we introduce the concept of a probabilistic safety region to describe a subset of the input space in which the number of misclassified instances is probabilistically controlled. The notion of scalable classifiers is then exploited to link the tuning of machine learning with error control. Several tests corroborate the approach. They are provided through synthetic data in order to highlight all the steps involved, as well as through a smart mobility application. Comment: 13 pages, 4 figures, 1 table, submitted to IEEE TNNL

    Rule-based Out-Of-Distribution Detection

    Out-of-distribution detection is one of the most critical issues in the deployment of machine learning. The data analyst must ensure that data in operation are compliant with the training phase, and must understand whether the environment has changed in a way that makes autonomous decisions no longer safe. The method of the paper is based on eXplainable Artificial Intelligence (XAI); it takes into account different metrics to identify any resemblance between in-distribution and out-of-distribution data, as seen by the XAI model. The approach is non-parametric and free of distributional assumptions. The validation over complex scenarios (predictive maintenance, vehicle platooning, covert channels in cybersecurity) corroborates both the precision of detection and the evaluation of the proximity between training and operating conditions. Results are available via open source and open data at the following link: https://github.com/giacomo97cnr/Rule-based-ODD

    eXplainable and Reliable Against Adversarial Machine Learning in Data Analytics

    Machine learning (ML) algorithms are nowadays widely adopted in different contexts to perform autonomous decisions and predictions. Due to the high volume of data shared in recent years, ML algorithms are more accurate and reliable since training and testing phases are more precise. An important concept to analyze when defining ML algorithms concerns adversarial machine learning attacks. These attacks aim to create manipulated datasets to mislead ML algorithm decisions. In this work, we propose new approaches able to detect and mitigate malicious adversarial machine learning attacks against an ML system. In particular, we investigate the Carlini-Wagner (CW), the fast gradient sign method (FGSM) and the Jacobian-based saliency map (JSMA) attacks. The aim of this work is to exploit detection algorithms as countermeasures to these attacks. Initially, we performed some tests using canonical ML algorithms with hyperparameter optimization to improve metrics. Then, we adopted original reliable AI algorithms, based on either eXplainable AI (Logic Learning Machine) or Support Vector Data Description (SVDD). The obtained results show that the classical algorithms may fail to identify an adversarial attack, while the reliable AI methodologies are more likely to correctly detect a possible adversarial machine learning attack. The proposed methodology was evaluated in terms of the balance between FPR and FNR on real-world application datasets: Domain Name System (DNS) tunneling, Vehicle Platooning and Remaining Useful Life (RUL). In addition, a statistical analysis was performed to improve the robustness of the trained models, including evaluating their performance in terms of runtime and memory consumption.

    CONFIDERAI: a novel CONFormal Interpretable-by-Design score function for Explainable and Reliable Artificial Intelligence

    Everyday life is increasingly influenced by artificial intelligence, and there is no question that machine learning algorithms must be designed to be reliable and trustworthy for everyone. Specifically, computer scientists consider an artificial intelligence system safe and trustworthy if it fulfills five pillars: explainability, robustness, transparency, fairness, and privacy. In addition to these five, we propose a sixth fundamental aspect: conformity, that is, the probabilistic assurance that the system will behave as the machine learner expects. In this paper, we propose a methodology to link conformal prediction with explainable machine learning by defining CONFIDERAI, a new score function for rule-based models that leverages both the rules' predictive ability and the points' geometrical position within rule boundaries. We also address the problem of defining regions in the feature space where conformal guarantees are satisfied, by exploiting techniques based on support vector data description (SVDD) to control the number of non-conformal samples in conformal regions. The overall methodology is tested with promising results on benchmark and real datasets, such as DNS tunneling detection or cardiovascular disease prediction. Comment: 12 pages, 7 figures, 1 algorithm, international journa

    Dual-View Single-Shot Multibox Detector at Urban Intersections: Settings and Performance Evaluation

    The explosion of artificial intelligence methods has paved the way for more sophisticated smart mobility solutions. In this work, we present a multi-camera video content analysis (VCA) system that exploits a single-shot multibox detector (SSD) network to detect vehicles, riders, and pedestrians and triggers alerts to drivers of public transportation vehicles approaching the surveilled area. The evaluation of the VCA system addresses both detection and alert generation performance by combining visual and quantitative approaches. Starting from an SSD model trained for a single camera, we added a second camera with a different field of view (FOV) to improve the accuracy and reliability of the system. Due to real-time constraints, the complexity of the VCA system must be limited, thus calling for a simple multi-view fusion method. According to the experimental test-bed, the use of two cameras achieves a better balance between precision (68%) and recall (84%) than the use of a single camera (62% precision and 86% recall). In addition, a system evaluation in temporal terms is provided, showing that missed alerts (false negatives) and wrong alerts (false positives) are typically transitory events. Therefore, adding spatial and temporal redundancy increases the overall reliability of the VCA system.

    On the Intersection of Explainable and Reliable AI for physical fatigue prediction

    In the era of Industry 4.0, the use of Artificial Intelligence (AI) is widespread in occupational settings. Where human safety is at stake, explainability and trustworthiness of AI are even more important than achieving high accuracy. eXplainable AI (XAI) is investigated in this paper to detect physical fatigue during a simulated manual material handling task. Besides comparing global rule-based XAI models (LLM and DT) to black-box models (NN, SVM, XGBoost) in terms of performance, we also compare global models with local ones (LIME over XGBoost). Surprisingly, global and local approaches reach similar conclusions in terms of feature importance. Moreover, an expansion from local rules to global rules is designed for Anchors by posing an appropriate optimization problem (Anchors coverage is enlarged from an original low value of 11% up to 43%). As far as trustworthiness is concerned, rule sensitivity analysis drives the identification of optimized regions in the feature space where physical fatigue is predicted with zero statistical error. The discovery of such “non-fatigue regions” helps certify organizational and clinical decision making.

    A novel method to derive personalized minimum viable recommendations for type 2 diabetes prevention based on counterfactual explanations

    Despite the growing availability of artificial intelligence models for predicting type 2 diabetes, there is still a lack of personalized approaches to quantify minimum viable changes in biomarkers that may help reduce the individual risk of developing the disease. The aim of this article is to develop a new method, based on counterfactual explanations, to generate personalized recommendations to reduce the one-year risk of type 2 diabetes. Ten routinely collected biomarkers extracted from Electronic Medical Records of 2791 patients at low risk and 2791 patients at high risk of type 2 diabetes were analyzed. Two regions characterizing the two classes of patients were estimated using a Support Vector Data Description classifier. Counterfactual explanations (i.e., minimal changes in input features able to change the risk class) were generated for patients at high risk and evaluated using performance metrics (availability, validity, actionability, similarity, and discriminative power) and a qualitative survey administered to seven expert clinicians. Results showed that, on average, the requested minimum viable changes implied a significant reduction of fasting blood sugar, systolic blood pressure, and triglycerides and a significant increase of high-density lipoprotein in patients at risk of diabetes. A significant reduction in body mass index was also recommended in most of the patients at risk, except in females without hypertension. In general, greater changes were recommended in hypertensive patients compared to non-hypertensive ones. The experts were overall satisfied with the proposed approach, although in some cases the proposed recommendations were deemed insufficient to reduce the risk in a clinically meaningful way. Future research will focus on a larger set of biomarkers and different comorbidities, also incorporating clinical guidelines whenever possible. Development of additional mathematical and clinical validation approaches will also be of paramount importance.

    From Explainable to Reliable Artificial Intelligence

    Artificial Intelligence systems today involve ever fewer interactions with humans, leading to autonomous decision-making processes. In this context, erroneous predictions can have severe consequences. As a solution, we design and develop a set of methods derived from eXplainable AI models. The aim is to define “safety regions” in the feature space where false negatives (e.g., in a mobility scenario, a prediction of no collision when a collision actually occurs) tend to zero. We test and compare the proposed algorithms on two different datasets (physical fatigue and vehicle platooning) and reach quite different conclusions: the results depend strongly on the level of noise in the dataset rather than on the algorithms at hand.